RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language
Abstract
The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and free word order. The participants were asked to group contexts containing a given word according to its senses, which were not provided beforehand. For instance, given the word “bank” and a set of contexts with this word, e.g. “bank is a financial institution that accepts deposits” and “river bank is a slope beside a body of water”, a participant was asked to cluster these contexts into a number of clusters that was not known in advance, corresponding in this case to the “company” and the “area” senses of the word “bank”. For the purpose of this evaluation campaign, we developed three new evaluation datasets based on sense inventories with different sense granularity. The contexts in these datasets were sampled from texts of Wikipedia, the academic corpus of Russian, and an explanatory dictionary of Russian. Overall, 18 teams participated in the competition, submitting 383 models. Multiple teams managed to substantially outperform competitive state-of-the-art baselines from previous years based on sense embeddings.
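The task setup described above can be illustrated with a minimal sketch. It is not the method of the organizers or of any participating team: it swaps in plain word overlap (Jaccard similarity) in place of sense embeddings, and the stopword list, the threshold value, the function names, and the two extra contexts are all assumptions made for illustration. It only shows the input and output of the task: contexts of one target word go in, and cluster labels come out, with the number of clusters not fixed in advance.

```python
import re

# Tiny illustrative stopword list (an assumption, not part of the shared task).
STOPWORDS = {"a", "an", "the", "is", "was", "of", "that", "and", "near", "beside"}


def content_words(context, target):
    """Lowercased tokens of a context, minus stopwords and the target word."""
    tokens = re.findall(r"[a-z]+", context.lower())
    return {t for t in tokens if t not in STOPWORDS and t != target}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def induce_senses(contexts, target, threshold=0.2):
    """Label each context with a cluster id; the cluster count is not preset.

    Two contexts are linked when their word overlap reaches `threshold`, and
    each connected group of linked contexts becomes one induced sense.
    """
    bags = [content_words(c, target) for c in contexts]
    labels = [None] * len(contexts)
    next_label = 0
    for start in range(len(contexts)):
        if labels[start] is not None:
            continue
        # Start a new cluster and grow it over all transitively linked contexts.
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(contexts)):
                if labels[j] is None and jaccard(bags[i], bags[j]) >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


contexts = [
    "bank is a financial institution that accepts deposits",
    "river bank is a slope beside a body of water",
    "the bank accepts deposits and gives loans",        # hypothetical extra context
    "the river bank was a muddy slope near the water",  # hypothetical extra context
]
labels = induce_senses(contexts, target="bank")
print(labels)
```

On these four contexts the sketch produces two groups, one for the "company" contexts and one for the "area" contexts. Word overlap would be far too brittle for real data, especially for a morphologically rich language such as Russian, where the same lemma appears in many surface forms.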
Related resources
RUSSE: The First Workshop on Russian Semantic Similarity
The paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference. There exist a lot of comparative studies on semantic similarity, yet no analysis of such measures was ever performed for the Russian language. Exploring this problem for the Russian language is even more interesting, because this language has featu...
Multilingual Word Sense Discrimination: A Comparative Cross-Linguistic Study
We describe a study that evaluates an approach to Word Sense Discrimination on three languages with different linguistic structures, English, Hebrew, and Russian. The goal of the study is to determine whether there are significant performance differences for the languages and to identify language-specific problems. The algorithm is tested on semantically ambiguous words using data from Wikipedi...
Hatred as a Moral and Ethical Conception in Russian Society
The present paper deals with the national specifics of the assessment aspect in the meaning of the words. A modern scientific paradigm considers the language as a cognitive tool of understanding the world and keeping and representing people’s experience and values which reflect the people’s vision of the world (“the world picture”). Usually linguistics understands the language ...
Word sense induction using word embeddings and community detection in complex networks
Word Sense Induction (WSI) is the ability to automatically induce word senses from corpora. The WSI task was first proposed to overcome the limitations of manually annotated corpora that are required in word sense disambiguation systems. Even though several works have been proposed to induce word senses, existing systems are still very limited in the sense that they make use of structured, domai...
unimelb: Topic Modelling-based Word Sense Induction
This paper describes our system for shared task 13 “Word Sense Induction for Graded and Non-Graded Senses” of SemEval-2013. The task is on word sense induction (WSI), and builds on earlier SemEval WSI tasks in exploring the possibility of multiple senses being compatible to varying degrees with a single contextual instance: participants are asked to grade senses rather than selecting a single s...
Publication date: 2018